Search CORE

42 research outputs found

POSH: Paris OpenSHMEM: A High-Performance OpenSHMEM Implementation for Shared Memory Systems

Author: Coti Camille
Publication venue
Publication date: 30/03/2014
Field of study

In this paper we present the design and implementation of POSH, an Open-Source implementation of the OpenSHMEM standard. We present a model for its communications, and prove some properties on the memory model defined in the OpenSHMEM specification. We present some performance measurements of the communication library featured by POSH and compare them with an existing one-sided communication library. POSH can be downloaded from \url{http://www.lipn.fr/~coti/POSH}. % 9 - 67Comment: This is an extended version (featuring the full proofs) of a paper accepted at ICCS'1

arXiv.org e-Print Archive

Elsevier - Publisher Connector

QR Factorization of Tall and Skinny Matrices in a Grid Computing Environment

Author: Camille Coti
Camille Coti
Camille Coti
Emmanuel Agullo
Emmanuel Agullo
Emmanuel Agullo
Jack Dongarra
Jack Dongarra
Jack Dongarra
Julien Langou
Julien Langou
Qr Fac
Thomas Herault
Thomas Herault
Thomas Herault
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 13/12/2009
Field of study

Previous studies have reported that common dense linear algebra operations do not achieve speed up by using multiple geographical sites of a computational grid. Because such operations are the building blocks of most scientific applications, conventional supercomputers are still strongly predominant in high-performance computing and the use of grids for speeding up large-scale scientific problems is limited to applications exhibiting parallelism at a higher level. We have identified two performance bottlenecks in the distributed memory algorithms implemented in ScaLAPACK, a state-of-the-art dense linear algebra library. First, because ScaLAPACK assumes a homogeneous communication network, the implementations of ScaLAPACK algorithms lack locality in their communication pattern. Second, the number of messages sent in the ScaLAPACK algorithms is significantly greater than other algorithms that trade flops for communication. In this paper, we present a new approach for computing a QR factorization -- one of the main dense linear algebra kernels -- of tall and skinny matrices in a grid computing environment that overcomes these two bottlenecks. Our contribution is to articulate a recently proposed algorithm (Communication Avoiding QR) with a topology-aware middleware (QCG-OMPI) in order to confine intensive communications (ScaLAPACK calls) within the different geographical sites. An experimental study conducted on the Grid'5000 platform shows that the resulting performance increases linearly with the number of geographical sites on large-scale problems (and is in particular consistently higher than ScaLAPACK's).Comment: Accepted at IPDPS10. (IEEE International Parallel & Distributed Processing Symposium 2010 in Atlanta, GA, USA.

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

HAL-Rennes 1

MPI Applications on Grids: A Topology-Aware Approach

Author: Cappello Franck
Coti Camille
Herault Thomas
Publication venue: HAL CCSD
Publication date: 01/01/2008
Field of study

Large Grids are build by aggregating smaller parallel machines through a public long-distance interconnection network (such as the Internet). Therefore, their structure is intrinsically hierarchical. Each level of the network hierarchy gives performances which differ from the other levels in terms of latency and bandwidth. MPI is the de facto standard for programming parallel machines, therefore an attractive solution for programming parallel applications on this kind of grids. However, because of the aforementioned differences of communication performances, the application continuously communicates back and forth between clusters, with a significant impact on performances. In this report, we present an extension of the information provided by the run-time environment of an MPI library, a set of efficient collective operations for grids and a methodology to organize communication patterns within applications with respect to the underlying physical topology, and implement it in a geophysics application

HAL-CentraleSupelec

HAL - Lille 3

INRIA a CCSD electronic archive server

Hal-Diderot

HAL-Rennes 1

A task-based approach to parallel parametric linear programming solving, and application to polyhedral computations

Author: Coti Camille
Monniaux David
Yu Hang
Publication venue
Publication date: 29/09/2020
Field of study

Parametric linear programming is a central operation for polyhedral computations, as well as in certain control applications.Here we propose a task-based scheme for parallelizing it, with quasi-linear speedup over large problems.This type of parallel applications is challenging, because several tasks mightbe computing the same region. In this paper, we are presenting thealgorithm itself with a parallel redundancy elimination algorithm, andconducting a thorough performance analysis.Comment: arXiv admin note: text overlap with arXiv:1904.0607

arXiv.org e-Print Archive

Crossref

Hal - Université Grenoble Alpes

MPI Applications on Grids: A Topology-Aware Approach

Author: Coti Camille
Herault Thomas
Cappello Franck
Publication venue: HAL CCSD
Publication date: 01/01/2008
Field of study

HAL - Lille 3

Crossref

Data Coherency in Distributed Shared Memory

Author: Butelle Franck
Coti Camille
Publication venue: Higashi Hiroshima : Dept. of Computer Engineering, Hiroshima University
Publication date: 01/01/2012
Field of study

International audienceWe present a new model for distributed shared memory systems, based on remote data accesses. Such features are offered by network interface cards that allow one-sided operations, remote direct memory access and OS bypass. This model leads to new interpretations of distributed algorithms allowing us to propose an innovative detection technique of race conditions only based on logical clocks. Indeed, the presence of (data) races in a parallel program makes it hard to reason about and is usually considered as a bug

HAL-Paris 13